Optimizing K-Means by Fixing Initial Cluster Centers
Abstract
Data mining techniques support business decision making and the prediction of behaviors and future trends. Clustering is a data mining technique that groups objects with similar characteristics. Because it analyzes data objects without consulting a known class label or category, it is an unsupervised technique. K-means is a widely used partitional clustering algorithm, but its performance depends strongly on the initial guess of the cluster centers (centroids), and the final centroids may not be optimal. A good choice of initial centroids is therefore important for K-means. By augmenting K-means with a technique that selects centroids using the sum of distances from each data object to all other data objects, we obtain the Farthest Distributed Centroids Clustering (FDCC) algorithm, which results in better clustering than not only the K-means partitional clustering algorithm but also the agglomerative hierarchical clustering algorithm and the hierarchical partitioning clustering algorithm. Unlike K-means, FDCC does not generate the initial centers randomly and does not produce different results for the same input data.
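The abstract names the selection criterion (the sum of distances from each object to all others) but does not spell out the full procedure. The following is a minimal sketch of one possible reading, not the authors' FDCC implementation. It assumes that the first center is the point with the largest distance sum and that each further center is the point farthest from the centers already chosen. The function names `distance_sum_init` and `kmeans` are hypothetical. The point it illustrates is that seeding without random numbers makes the result reproducible.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance_sum_init(points: Sequence[Point], k: int) -> List[Point]:
    """Pick k initial centers deterministically (assumed reading, not FDCC itself)."""
    # Sum of Euclidean distances from each point to every other point.
    totals = [sum(math.dist(p, q) for q in points) for p in points]
    # Assumption: start from the point with the largest distance sum.
    # max() returns the first maximal item, so ties resolve by lowest index.
    chosen = [max(range(len(points)), key=lambda i: totals[i])]
    while len(chosen) < k:
        # Assumption: the next center is the point farthest from its
        # nearest already-chosen center (farthest-first traversal).
        candidates = [i for i in range(len(points)) if i not in chosen]
        nxt = max(
            candidates,
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [tuple(points[i]) for i in chosen]


def kmeans(points: Sequence[Point], k: int, max_iter: int = 100):
    """Standard Lloyd iterations started from the deterministic seeds."""
    centers = distance_sum_init(points, k)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(data, 2))
```

Because no step draws random numbers, calling `kmeans` twice on the same data returns identical centers and labels, which is the reproducibility property the abstract claims for FDCC. The seeding step computes all pairwise distances, so it costs O(n²) distance evaluations.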
Similar resources
A Genetic K-means Clustering Algorithm Based on the Optimized Initial Centers
An optimized initial-center K-means algorithm (PKM) is proposed, which selects the k data points that are farthest apart in the high-density area as the initial cluster centers. Experiments show that the algorithm not only depends weakly on the initial data but also converges quickly and yields high clustering quality. To obtain effective and accurate clusters, we combine the optimized K-means a...
Modified K-Means for Better Initial Cluster Centres
The k-means clustering algorithm is widely used in data mining for real-world applications. The efficiency and performance of the k-means algorithm are greatly affected by the initial cluster centers, as different initial cluster centers often lead to different clusterings. In this paper, we propose a modified k-means algorithm which has additional steps for selecting better cluster centers. W...
Improved Fuzzy Art Method for Initializing K-means
The K-means algorithm is quite sensitive to the initially selected cluster centers and can produce different clusterings depending on these initialization conditions. In this study, a new method based on the Fuzzy ART algorithm, called Improved Fuzzy ART (IFART), is used to determine the initial cluster centers. By using IFART, better quality clusters are achieved...
A new algorithm for choosing initial cluster centers for k-means
The k-means algorithm is widely used in many applications due to its simplicity and fast speed. However, its result is very sensitive to the initialization step: choosing initial cluster centers. Different initialization algorithms may lead to different clustering results and may also affect the convergence of the method. In this paper, we propose a new algorithm for improving the initializatio...
A New Initialization Method to Originate Initial Cluster Centers for K-Means Algorithm
The K-means algorithm is the most popular partition-based algorithm and is widely used in data clustering. Many algorithms based on K-means have been proposed for data clustering because of its simplicity, efficiency, and ease of convergence. Despite this, the K-means algorithm has drawbacks such as sensitivity to the initial cluster centers and a tendency to get stuck in local optima. In this study, a new method is proposed to address...